Decoding of MDP Convolutional Codes 
over the Erasure Channel 
Abstract: This paper studies the decoding capabilities of maximum
distance profile (MDP) convolutional codes over the erasure
channel and compares them with the decoding capabilities of
MDS block codes over the same channel. The erasure channel
involving large alphabets is an important practical channel model
when studying packet transmissions over a network, e.g., the
Internet.

Keywords: Convolutional codes, maximum distance separable
codes, parity check matrix, decoding, erasure channel,
Reed-Solomon codes.

I. Introduction 

When transmitting over an erasure channel like the Internet,
one of the problems encountered is the delay in the received
information caused by the possible retransmission of lost
packets. One way to eliminate these delays is to use forward
error correction.

Until now only block codes have been used for such a task, 
see [1], [2]. In this paper we demonstrate how maximum dis- 
tance profile (MDP) convolutional codes provide an attractive 
alternative. 

Convolutional codes have a certain flexibility given by the
"sliding window" characteristic. This means that the received
information can be grouped in blocks or windows in many
ways, depending on the erasure bursts, and then be decoded
by decoding the "easy" blocks first. This flexibility in grouping
information brings certain freedom in the handling of
sequences: we can split the blocks into smaller windows, we
can overlap windows, and we can proceed to decode in a less
strict order. The blocks are not fixed as in the block code case,
i.e., they do not have a fixed grouping of a fixed length. We
can slide along the transmitted sequence and decide the place
where we want to start our decoding depending on where the
erasures occur. This property allows us to correct more erasures
in a given block than a block code of the same length could.

An $[N, K]$ block code used for transmission over an erasure
channel can correct up to $N - K$ erasures in a given block. The
optimal erasure-correcting capability of $N - K$ is achieved by
an $[N, K]$ maximum distance separable (MDS) code.

As an alternative consider now a class of $(n, k, \delta)$
convolutional codes, i.e., a class of rate $k/n$ convolutional codes
having degree $\delta$. We will demonstrate that for this class, the
maximum number of erasures which can be corrected in some
sliding window of appropriate size is achieved by the subclass
of MDP convolutional codes. In this paper, we will study the
maximum number of erasures that such a class of codes can
decode and the conditions under which this happens. Moreover,
we will show that over the erasure channel this class of codes
can be decoded extremely efficiently.

The paper is organized as follows. Section II provides
the necessary background for the development of the paper.
Subsection II-A explains the assumptions on the channel
model; Subsection II-B provides all the necessary concepts
about MDP convolutional codes and their characterizations.
Section III is the main part of the paper. It contains our main
result and describes in detail the decoding procedure. It also
provides examples and special concerns to be noticed when
comparing with MDS block codes, and in particular with
Reed-Solomon codes. Section IV shows a decoding method
in which the transmitted information is recovered directly.

II. Preliminaries 

A. Erasure channel 

An erasure channel is a communication channel where the 
symbols sent either arrive correctly or the receiver knows that 
a symbol has not been received or was received incorrectly. An 
important example of an erasure channel is the Internet, where 
packet sizes are upper bounded by 12,000 bits - the maximum 
that the Ethernet protocol allows (that everyone uses at the 
user end). In many cases, this maximum is actually used. Due 
to the nature of the TCP part of the TCP/IP protocol stack, 
most sources need an acknowledgment confirming that the 
packet has arrived at the destination; these packets are only 320
bits long. So if everyone were to use TCP/IP, the packet size
distribution would be as follows: 35% at 320 bits, 35% at 12,000
bits and 30% uniformly distributed in between. Real-time traffic
used, e.g., in video calling does not need an acknowledgment
since that would take too much time; overall, the following is
a good assumption of the packet size distribution: 30% at 320
bits, 50% at 12,000 bits and 20% uniformly distributed in between.

We can model each packet as an element or sequence of
elements from a large alphabet. Since packets over the Internet
are usually protected by a cyclic redundancy check (CRC)
code, the receiver knows when a packet is in error or has not
arrived. For the purpose of illustration we could employ as
alphabet the finite field $\mathbb{F} := \mathbb{F}_{2^{1,000}}$. If a packet has fewer than
1,000 bits then one simply uses the corresponding element
of $\mathbb{F}$. If the packet is larger one uses several alphabet symbols
to describe the packet. Even if one uses some interleaving,
such an encoding scheme results in the property that errors
tend to occur in bursts, and this is a phenomenon observed in
many channels modeled via the erasure channel. This point is
important to keep in mind when designing codes which are
capable of correcting many errors over the erasure channel.

B. MDP convolutional codes 

Let $\mathbb{F}$ be a finite field. We view a convolutional code $\mathcal{C}$ with
rate $k/n$ as a submodule of $\mathbb{F}^n[z]$ (see [4], [11], [12]) that can
be described as

$$\mathcal{C} = \{ v(z) \in \mathbb{F}^n[z] \mid v(z) = G(z)u(z) \text{ with } u(z) \in \mathbb{F}^k[z] \}$$

where $G(z)$ is an $n \times k$ polynomial matrix called a generator
matrix for $\mathcal{C}$, $u(z)$ is the information vector and $v(z)$ is the
code vector or codeword.

We define the degree of a convolutional code $\mathcal{C}$, and
we denote it by $\delta$, as the maximum of the degrees of the
determinants of the $k \times k$ sub-matrices of any generator
matrix of $\mathcal{C}$. Then we say that $\mathcal{C}$ is an $(n, k, \delta)$ convolutional
code [10].

In case the convolutional code $\mathcal{C}$ is also observable (see,
e.g., [11], [14]) then $\mathcal{C}$ can be equivalently described through
a parity check matrix. In other words, there exists in this case
an $(n-k) \times n$ full rank polynomial matrix $H(z)$ such that

$$\mathcal{C} = \{ v(z) \in \mathbb{F}^n[z] \mid H(z)v(z) = 0 \in \mathbb{F}^{n-k}[z] \}.$$

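If we write $v(z) = v_0 + v_1 z + \dots + v_l z^l$ (with $l \ge 0$) and
we represent $H(z)$ as a matrix polynomial

$$H(z) = H_0 + H_1 z + \dots + H_\nu z^\nu$$

we can expand the kernel representation in the following way

$$\begin{bmatrix}
H_0 & & \\
\vdots & \ddots & \\
H_\nu & \cdots & H_0 \\
 & \ddots & \vdots \\
 & & H_\nu
\end{bmatrix}
\begin{bmatrix} v_0 \\ v_1 \\ \vdots \\ v_l \end{bmatrix} = 0. \tag{1}$$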

An important distance measure for convolutional codes is
the free distance:

$$d_{\mathrm{free}}(\mathcal{C}) := \min\{ \mathrm{wt}(v(z)) \mid v(z) \in \mathcal{C} \text{ and } v(z) \neq 0 \}.$$

The following lemma shows the importance of the free dis- 
tance as a performance measure of a code used over the erasure 
channel. 

Lemma 2.1: If $\mathcal{C}$ is a convolutional code with free distance
$d = d_{\mathrm{free}}(\mathcal{C})$ and if during transmission at most $d - 1$ erasures
occur then these erasures can be uniquely decoded. Moreover,
there exist patterns of $d$ erasures which cannot be uniquely
decoded.

Proof: Let $v(z) = v_0 + v_1 z + \dots + v_l z^l$ be a received
vector with $d - 1$ symbols erased. Let the erasures be in positions
$i_1, \dots, i_{d-1}$. The homogeneous system (1) of $(\nu + l + 1)(n - k)$
equations with $(l + 1)n$ unknowns can be changed into an
equivalent nonhomogeneous system

$$\tilde{H} \begin{bmatrix} v_{i_1} \\ \vdots \\ v_{i_{d-1}} \end{bmatrix} = b$$

of $(\nu + l + 1)(n - k)$ equations with $d - 1$ unknowns, where the
columns of $\tilde{H}$ are the columns of the matrix in (1) at the erased
positions and $b$ collects the contribution of the correctly
received symbols.

This nonhomogeneous system has a solution, because of the
assumption that the channel allows only erasures. In addition
the columns of the system matrix are linearly independent,
because $d = d_{\mathrm{free}}(\mathcal{C})$, so the matrix $\tilde{H}$ has full column rank. It
follows from these two facts that the solution must be unique.

■
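The argument of Lemma 2.1 can be tried out numerically. The following Python sketch is an illustration only and is not taken from the paper: it assumes a small $(2,1,1)$ code over the prime field $\mathbb{F}_7$ with $H(z) = [1 + z, \; 1 + 2z]$, and the helper names `solve_unique_mod_p`, `system_matrix` and `fill_erasures` are hypothetical. It builds the matrix of system (1), moves the known symbols to the right-hand side and solves for the erased ones.

```python
def solve_unique_mod_p(A, b, p):
    """Solve A x = b over GF(p), p prime. Return x if a solution exists
    and is unique, otherwise None."""
    rows = [[a % p for a in row] + [bi % p] for row, bi in zip(A, b)]
    ncols = len(A[0])
    rank = 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None  # free variable: the solution is not unique
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [x * inv % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    if any(row[-1] for row in rows[rank:]):
        return None  # inconsistent system
    return [rows[i][-1] for i in range(ncols)]


def system_matrix(H_coeffs, l):
    """Matrix of system (1): (l + nu + 1) block rows, (l + 1) block columns.
    H_coeffs[i] is the (n-k) x n matrix H_i given as a list of rows."""
    nu = len(H_coeffs) - 1
    n = len(H_coeffs[0][0])
    M = []
    for bi in range(l + nu + 1):
        for rr in range(len(H_coeffs[0])):
            row = []
            for bj in range(l + 1):
                d = bi - bj
                row += H_coeffs[d][rr] if 0 <= d <= nu else [0] * n
            M.append(row)
    return M


def fill_erasures(M, received, p):
    """received: list of symbols in GF(p) with None marking an erasure."""
    erased = [i for i, s in enumerate(received) if s is None]
    A = [[row[i] for i in erased] for row in M]
    # Known symbols move to the right-hand side of the system.
    b = [-sum(row[i] * s for i, s in enumerate(received) if s is not None)
         for row in M]
    x = solve_unique_mod_p(A, b, p)
    if x is None:
        return None
    out = list(received)
    for i, val in zip(erased, x):
        out[i] = val
    return out


# Assumed example: H_0 = [1 1], H_1 = [1 2] over GF(7).
H_COEFFS = [[[1, 1]], [[1, 2]]]
received = [None, 6, None, 3, 6, None]   # codeword (1,6,5,3,6,4), 3 erasures
decoded = fill_erasures(system_matrix(H_COEFFS, l=2), received, 7)
print(decoded)
```

The solver reports failure (returns `None`) whenever the columns at the erased positions are linearly dependent, which is exactly the situation the lemma rules out for fewer than $d$ erasures.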

Rosenthal and Smarandache [13] showed that an $(n, k, \delta)$
convolutional code has a free distance upper bounded by

$$d_{\mathrm{free}}(\mathcal{C}) \le (n - k)\left( \left\lfloor \frac{\delta}{k} \right\rfloor + 1 \right) + \delta + 1. \tag{2}$$

This bound is known as the generalized Singleton bound [13]
since it generalizes in a natural way the Singleton bound for
block codes. Analogously, we say that an $(n, k, \delta)$ code is a
maximum distance separable (MDS) convolutional code [13]
if its free distance achieves the generalized Singleton bound.

Another local distance measure, important as well for
decoding and related to the previous one, is the column
distance [7], $d^c_j(\mathcal{C})$, given by the expression

$$d^c_j(\mathcal{C}) = \min\{ \mathrm{wt}(v_{[0,j]}(z)) \mid v(z) \in \mathcal{C} \text{ and } v_0 \neq 0 \}$$

where $v_{[0,j]}(z) = v_0 + v_1 z + \dots + v_j z^j$ represents the $j$-th
truncation of the codeword $v(z) \in \mathcal{C}$. It is related to
$d_{\mathrm{free}}(\mathcal{C})$ in the following way

$$d_{\mathrm{free}}(\mathcal{C}) = \lim_{j \to \infty} d^c_j(\mathcal{C}). \tag{3}$$

The $j$-th column distance is then upper bounded by

$$d^c_j(\mathcal{C}) \le (n - k)(j + 1) + 1 \tag{4}$$

and the maximality of any of the column distances implies the
maximality of all the previous ones, that is, if
$d^c_j(\mathcal{C}) = (n - k)(j + 1) + 1$ for some $j$, then
$d^c_i(\mathcal{C}) = (n - k)(i + 1) + 1$ for $i \le j$, see [3], [5].
The $(m + 1)$-tuple $(d^c_0(\mathcal{C}), d^c_1(\mathcal{C}), \dots, d^c_m(\mathcal{C}))$
is called the column distance profile of the code [7].

Since no column distance can achieve a value greater than
the generalized Singleton bound, the largest integer for which
that bound can be attained is

$$L = \left\lfloor \frac{\delta}{k} \right\rfloor + \left\lfloor \frac{\delta}{n - k} \right\rfloor. \tag{5}$$

An $(n, k, \delta)$ convolutional code $\mathcal{C}$ is maximum distance
profile (MDP) [3], [5], if $d^c_L(\mathcal{C}) = (n - k)(L + 1) + 1$. In
this case, every $d^c_j(\mathcal{C})$ for $j \le L$ is maximal, so we can say
that the column distances of MDP codes increase as rapidly
as possible for as long as possible.
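For concreteness, the quantities in (2), (4) and (5) can be evaluated with a few lines of Python. This is a minimal sketch; the function names are chosen here for illustration and do not come from the paper.

```python
def generalized_singleton_bound(n, k, delta):
    """Right-hand side of (2): upper bound on the free distance."""
    return (n - k) * (delta // k + 1) + delta + 1


def mdp_window_parameter(n, k, delta):
    """The integer L of (5)."""
    return delta // k + delta // (n - k)


def max_column_distance(n, k, j):
    """Right-hand side of (4): upper bound on the j-th column distance."""
    return (n - k) * (j + 1) + 1


n, k, delta = 2, 1, 50
L = mdp_window_parameter(n, k, delta)
print("L =", L)
print("window length (L+1)n =", (L + 1) * n)
print("maximal d^c_L =", max_column_distance(n, k, L))
print("generalized Singleton bound =", generalized_singleton_bound(n, k, delta))
```

For the parameters $(2, 1, 50)$ the maximal $L$-th column distance coincides with the generalized Singleton bound, which illustrates why $L$ is the last index at which the bound (4) can be attained.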

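In order to characterize the column distances as well as
MDP codes algebraically assume the parity check matrix is
given as $H(z) = \sum_{i=0}^{\nu} H_i z^i$. For each $j > \nu$ define $H_j = 0$
and define:

$$\mathcal{H}_j = \begin{bmatrix}
H_0 & & & \\
H_1 & H_0 & & \\
\vdots & \vdots & \ddots & \\
H_j & H_{j-1} & \cdots & H_0
\end{bmatrix} \in \mathbb{F}^{(j+1)(n-k) \times (j+1)n}. \tag{6}$$

Then we have: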

Theorem 2.2: ([3, Proposition 2.1]) Let $d \in \mathbb{N}$. Then the
following properties are equivalent.

(a) $d^c_j = d$;

(b) none of the first $n$ columns of $\mathcal{H}_j$ is contained in the span
of any other $d - 2$ columns and one of the first $n$ columns
of $\mathcal{H}_j$ is in the span of some other $d - 1$ columns of that
matrix.

As a consequence we have the algebraic characterization of 
MDP convolutional codes: 

Theorem 2.3: ([5, Theorem 3.1]) The $j$-th column distance
attains the maximum value

$$d^c_j = (n - k)(j + 1) + 1, \tag{7}$$

if and only if every $(j + 1)(n - k) \times (j + 1)(n - k)$ full-size
minor of $\mathcal{H}_j$ formed from the columns with indices
$1 \le i_1 < \dots < i_{(j+1)(n-k)}$, where $i_{s(n-k)} \le sn$ for $s = 1, \dots, j$, is
nonzero.

In particular when $j = L$, $H(z)$ represents an MDP
code if and only if every $(L + 1)(n - k) \times (L + 1)(n - k)$
full-size minor of $\mathcal{H}_L$ formed from the columns with indices
$1 \le i_1 < \dots < i_{(L+1)(n-k)}$, where $i_{s(n-k)} \le sn$ for
$s = 1, \dots, L$, is nonzero.
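The minor condition of Theorem 2.3 can be checked by brute force for small parameters. The sketch below is only an illustration under stated assumptions: it works over a prime field $\mathbb{F}_p$, it uses the hypothetical example $H(z) = [1 + z, \; 1 + 2z]$ over $\mathbb{F}_7$ (a $(2,1,1)$ code, so $L = 2$), and the function names are not taken from the paper. The enumeration is exponential in the window size and is meant for toy cases only.

```python
from itertools import combinations


def det_mod_p(matrix, p):
    """Determinant of a square matrix over GF(p), p prime."""
    a = [[x % p for x in row] for row in matrix]
    size, det = len(a), 1
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col]), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det = det * a[col][col] % p
        inv = pow(a[col][col], -1, p)
        for r in range(col + 1, size):
            f = a[r][col] * inv % p
            if f:
                a[r] = [(x - f * y) % p for x, y in zip(a[r], a[col])]
    return det % p


def sliding_matrix(H_coeffs, j):
    """The matrix of (6), with H_i = 0 for i beyond the given coefficients."""
    n = len(H_coeffs[0][0])
    rows = []
    for bi in range(j + 1):
        for rr in range(len(H_coeffs[0])):
            row = []
            for bj in range(j + 1):
                d = bi - bj
                row += H_coeffs[d][rr] if 0 <= d < len(H_coeffs) else [0] * n
            rows.append(row)
    return rows


def has_maximal_column_distance(H_coeffs, j, p):
    """True if every minor named in Theorem 2.3 is nonzero over GF(p)."""
    r, n = len(H_coeffs[0]), len(H_coeffs[0][0])   # r = n - k
    Hj = sliding_matrix(H_coeffs, j)
    for cols in combinations(range(1, (j + 1) * n + 1), (j + 1) * r):
        # Keep only index sets with i_{s(n-k)} <= s*n for s = 1, ..., j.
        if any(cols[s * r - 1] > s * n for s in range(1, j + 1)):
            continue
        sub = [[row[c - 1] for c in cols] for row in Hj]
        if det_mod_p(sub, p) == 0:
            return False
    return True


H_GOOD = [[[1, 1]], [[1, 2]]]   # assumed example: H_0 = [1 1], H_1 = [1 2]
H_BAD = [[[1, 1]], [[1, 1]]]    # H_1 = H_0 makes one 2 x 2 minor vanish
print(has_maximal_column_distance(H_GOOD, 2, 7))
print(has_maximal_column_distance(H_BAD, 1, 7))
```

With $j = L = 2$ the first call tests the MDP property of the assumed example; the second shows how a single vanishing minor already rules out maximality at $j = 1$.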

MDP convolutional codes can be thought of as behaving like an MDS
block code within windows of size $(L + 1)n$. The nonsingular
full-size minors property given in the previous theorem ensures
that if we truncate a codeword with $v_0 \neq 0$ at any iteration
$j \le L$, it will have weight at least the maximal value
$(n - k)(j + 1) + 1$ given in (7).


III. Decoding over an erasure channel 

Let us suppose that we use an MDP convolutional code C 
to transmit over an erasure channel. Then we can state the 
following result. 

Theorem 3.1: Let $\mathcal{C}$ be an $(n, k, \delta)$ MDP convolutional
code. If in any sliding window of length $(L + 1)n$ at most
$(L + 1)(n - k)$ erasures occur then we can recover the whole
sequence.

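Proof: Assume that we have been able to correctly
decode up to an instant $t - 1$. Then we have the following
homogeneous system:

$$\begin{bmatrix}
H_L & H_{L-1} & \cdots & H_0 & & & \\
 & H_L & \cdots & H_1 & H_0 & & \\
 & & \ddots & & & \ddots & \\
 & & & H_L & H_{L-1} & \cdots & H_0
\end{bmatrix}
\begin{bmatrix} v_{t-L} \\ \vdots \\ v_{t-1} \\ \star \\ \vdots \\ \star \end{bmatrix} = 0 \tag{8}$$

where $\star$ takes the place of a vector that had some of the
components erased. Let the positions of the erased field elements
be $i_1, \dots, i_e$, $e \le (n - k)(L + 1)$, where $i_1, \dots, i_s$, $s \le n$,
are the erasures occurring in the first erased $n$-vector. We can
compute the syndrome and get a nonhomogeneous system with
$(L + 1)(n - k)$ equations and $e$, at most $(L + 1)(n - k)$,
variables.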

We claim that there is an extension $\{v_t, \dots, v_{t+L}\}$ such
that the vector $(v_{t-L}, \dots, v_{t-1}, v_t, \dots, v_{t+L})$ is a codeword
and such that $v_t$ is unique.

Indeed, we know that a solution of the system exists since
we assumed only erasures occur. To prove the uniqueness of
$v_t$, or equivalently, of the erased elements $v_{i_1}, \dots, v_{i_s}$, let us
suppose there exist two such good extensions $\{v_t, \dots, v_{t+L}\}$
and $\{\tilde{v}_t, \dots, \tilde{v}_{t+L}\}$. Let $h_{i_1}, \dots, h_{i_e}$ be the column vectors
of the sliding parity-check matrix in (8) which correspond to
the erased elements. We have:

$$v_{i_1} h_{i_1} + \dots + v_{i_s} h_{i_s} + \dots + v_{i_e} h_{i_e} = b$$

and

$$\tilde{v}_{i_1} h_{i_1} + \dots + \tilde{v}_{i_s} h_{i_s} + \dots + \tilde{v}_{i_e} h_{i_e} = \tilde{b}$$

where the vectors $b$ and $\tilde{b}$ correspond to the known part of the
system. Subtracting these equations and observing that $b = \tilde{b}$,
we obtain:

$$(\tilde{v}_{i_1} - v_{i_1}) h_{i_1} + \dots + (\tilde{v}_{i_s} - v_{i_s}) h_{i_s} + \dots + (\tilde{v}_{i_e} - v_{i_e}) h_{i_e} = 0.$$

Using Theorem 2.2 for a window of size $L$, and using that
the code is MDP, so $d^c_L = (L + 1)(n - k) + 1$, we obtain that,
necessarily,

$$\tilde{v}_{i_1} - v_{i_1} = 0, \; \dots, \; \tilde{v}_{i_s} - v_{i_s} = 0,$$

by part (b) of Theorem 2.2. This concludes the proof of our
claim.

In order to find the value of this unique vector, we solve
the full column rank system, find a solution and retain the part
which is unique. Then we slide $n$ symbols to the next $n(L + 1)$
window and proceed as above.
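The procedure described in the proof can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: it assumes a prime field $\mathbb{F}_p$, a finite codeword that is zero outside the received stream, and the same hypothetical $(2,1,1)$ example $H(z) = [1 + z, \; 1 + 2z]$ over $\mathbb{F}_7$ with $L = 2$. The names `rref_mod_p` and `decode_stream` are invented here. The step "retain the part which is unique" is realized by row reduction: a symbol of block $t$ is accepted only if its column is a pivot column whose row involves no free variable.

```python
def rref_mod_p(rows, ncols, p):
    """Row-reduce augmented rows over GF(p) in place; return pivot columns."""
    pivots, rank = [], 0
    for col in range(ncols):
        pr = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pr is None:
            continue
        rows[rank], rows[pr] = rows[pr], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [x * inv % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[rank])]
        pivots.append(col)
        rank += 1
    return pivots


def decode_stream(H_coeffs, L, blocks, p):
    """Fill erasures (None) block by block, as in the proof of Theorem 3.1.
    blocks[t] is the received n-vector v_t; v_t = 0 is assumed outside."""
    r, n = len(H_coeffs[0]), len(H_coeffs[0][0])
    T = len(blocks)
    v = [list(b) for b in blocks]
    for t in range(T):
        if None not in v[t]:
            continue
        # Unknowns: erased symbols in blocks t, ..., t+L.
        unknowns = [(tau, c) for tau in range(t, min(t + L + 1, T))
                    for c in range(n) if v[tau][c] is None]
        col_of = {u: i for i, u in enumerate(unknowns)}
        rows = []
        for tau in range(t, t + L + 1):   # check: sum_i H_i v_{tau-i} = 0
            for rr in range(r):
                row = [0] * (len(unknowns) + 1)
                for i, Hi in enumerate(H_coeffs):
                    if not 0 <= tau - i < T:
                        continue          # symbols outside the stream are 0
                    for c in range(n):
                        s = v[tau - i][c]
                        if s is None:
                            row[col_of[(tau - i, c)]] += Hi[rr][c]
                        else:
                            row[-1] -= Hi[rr][c] * s   # known part (syndrome)
                rows.append([x % p for x in row])
        pivots = rref_mod_p(rows, len(unknowns), p)
        free = [j for j in range(len(unknowns)) if j not in pivots]
        for (tau, c), j in col_of.items():
            if tau != t:
                continue                  # only block t is retained
            if j not in pivots or any(rows[pivots.index(j)][f] for f in free):
                raise ValueError(f"block {t} is not uniquely determined")
            v[t][c] = rows[pivots.index(j)][-1]
    return v


H_COEFFS = [[[1, 1]], [[1, 2]]]              # assumed example over GF(7)
received = [[None, None], [5, None], [6, 4]]  # 3 erasures in a window of 6
print(decode_stream(H_COEFFS, 2, received, 7))
```

Only the symbols of the first erased block are kept at each step; later blocks are solved again once the window has moved, which mirrors the sliding of $n$ symbols in the proof.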



A. Examples and Remarks 

Remark 3.2: The decoding algorithm requires only simple
linear algebra. For every $(n - k)$ erasures a matrix of size at
most $(L + 1)(n - k)$ has to be inverted over the base field $\mathbb{F}$.
This is easily achieved even over fairly large fields.

In addition one should notice that for a rate $k/n$ MDP
convolutional code, $100 \cdot \frac{n-k}{n}$ percent of the erasures can be
corrected.

Remark 3.3: Theorem 3.1 is optimal in a certain sense: one
can show that for any $(n, k, \delta)$ code there exist patterns of
$(L + 2)(n - k)$ erasures in a sliding window of length $(L + 2)n$
which cannot be uniquely decoded.

The following illustrative example compares the size of a 
particular MDP convolutional code with an MDS block code 
which would perform similarly. 

Example 3.4: Let us take a (2,1,50) MDP convolutional 
code to decode over an erasure channel. In this case the 
decoding can be completed if in any sliding window of length 
202 there are not more than 101 erasures; 50% of the erasures 
can be recovered. 

The MDS block code which achieves a comparable perfor- 
mance is a [200, 100] MDS block code. In a block of 200 
symbols we can recover 100 erasures, that is again 50%. 

Remark 3.5: It has been noticed that the parameter $L$ gives
us an upper bound on the length of the window we can take
to correct, but it should be noticed as well that the property
of Theorem 2.3 holds for every $j \le L$. This means that we
can take smaller windows to set up our systems (the size will be
conveniently decided by the distribution of the erasures in the
sequence). Then in a window of size $(j + 1)n$ symbols we can
recover at most $(j + 1)(n - k)$ erasures.

This property allows us to recover the erasures in situations
where the MDS block codes cannot do it. For example, assume
that we have been able to correctly decode up to an instant $t$
and then there comes a block of 200 symbols where 2 bursts of
60 erasures occur separated by a block of 80 clean symbols,
and after it, clean symbols again.

$$\overbrace{\star \star \cdots \star \star}^{60} \; v_{61} v_{62} \cdots v_{140} \; \overbrace{\star \star \cdots \star \star}^{60} \; v_{201} v_{202} \cdots$$

In this situation 120 erasures happen in a block of 200 symbols
and the MDS block code is not able to recover them. In the
block code situation one has to skip the whole block, losing
that information, and go on with the decoding.

However, the MDP convolutional code can deal with this
situation. Let us set a window of length 120 symbols; in these
windows we can correct up to 60 erasures. We can take 100
previously decoded symbols, then set a window with the first
60 erasures and 60 more clean symbols. In this way we can
recover the first block of erasures. Then we can slide through
the received sequence with this 120-symbol window until we
recover the rest of the erasures in the same way.

$$\cdots v_{140} \; \overbrace{\star \star \cdots \star \star}^{60} \; v_{201} v_{202} \cdots v_{260}$$

After this we have correctly decoded the sequence.
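The counting behind this example can be reproduced with a short Python sketch. It only counts erasures per window and does not decode anything; the function name and the layout of the test sequence (100 clean symbols, the two bursts separated by 80 clean symbols, then 60 clean symbols) are assumptions made for illustration.

```python
def max_erasures_in_window(erased, width):
    """Largest number of erasures in any run of `width` consecutive symbols.
    `erased` is a list of booleans, True marking an erased symbol."""
    count = sum(erased[:width])
    best = count
    for i in range(width, len(erased)):
        count += erased[i] - erased[i - width]   # slide the window by one
        best = max(best, count)
    return best


# 100 decoded symbols, burst of 60, 80 clean, burst of 60, 60 clean.
pattern = [False] * 100 + [True] * 60 + [False] * 80 + [True] * 60 + [False] * 60

# A [200, 100] block aligned with the two bursts sees 120 > 100 erasures.
print(max_erasures_in_window(pattern, 200))
# Any window of 120 symbols sees at most 60 erasures.
print(max_erasures_in_window(pattern, 120))
```

With windows of 120 symbols (that is, $j = 59$ for the $(2,1,50)$ code) the erasure count never exceeds $(j + 1)(n - k) = 60$, whereas the full window of 202 symbols would contain 120 erasures, more than the 101 allowed by Theorem 3.1.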

Remark 3.6: Another advantage is related to the
storage and to the field size required to construct the codes.
In the example we proposed, we have a [200, 100] MDS block
code. If we take, for example, a Reed-Solomon code (one of
the most widely used MDS block codes) then we need to store
the 200 roots of a 200 degree polynomial to set the code. That
is, we need at least 200 field elements.

However, to set the (2, 1, 50) MDP convolutional code we
need to store the coefficients of 2 polynomials of degree 50,
that is at least 100 different elements.

Nevertheless there are some disadvantages. On the one
hand, the storage and the field size are smaller, but on the
other hand, there are no direct constructions for the case of
MDP convolutional codes. This is still an open problem.

The construction of MDP convolutional codes has been
developed somewhat [6]; however, there still exists no efficient
algorithm to construct this class of codes. In relation to this
problem, a special type of matrices called superregular matrices
proved to be relevant, and this topic has
become of main importance when trying to construct MDP
convolutional codes [5], [8].

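If we denote by $T^{i_1, \dots, i_r}_{j_1, \dots, j_r}$ the $r \times r$ submatrix obtained from
a matrix $T \in \mathbb{F}^{n \times n}$ by taking the rows with indices $i_1, \dots, i_r$
and the columns with indices $j_1, \dots, j_r$, then we can define a
superregular matrix as follows.

Definition 3.7: [5] A lower triangular Toeplitz matrix $T$

$$T = \begin{bmatrix}
t_1 & 0 & \cdots & 0 \\
t_2 & t_1 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
t_n & \cdots & t_2 & t_1
\end{bmatrix} \in \mathbb{F}^{n \times n} \tag{9}$$

is said to be superregular if $T^{i_1, \dots, i_r}_{j_1, \dots, j_r}$ is nonsingular for all
$1 \le r \le n$ and all indices $1 \le i_1 < \dots < i_r \le n$,
$1 \le j_1 < \dots < j_r \le n$ which satisfy $j_s \le i_s$ for $s = 1, \dots, r$. The
submatrices obtained by picking such indices are called the
proper submatrices and their determinants the proper minors
of $T$.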
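Definition 3.7 can be tested directly on small matrices. The following Python sketch is an illustration under the assumption of a prime field $\mathbb{F}_p$; the function names and the two sample first columns are hypothetical and are not taken from [5]. It enumerates all proper submatrices and checks that their determinants are nonzero, which is feasible only for small $n$.

```python
from itertools import combinations


def det_mod_p(matrix, p):
    """Determinant of a square matrix over GF(p), p prime."""
    a = [[x % p for x in row] for row in matrix]
    size, det = len(a), 1
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col]), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det = det * a[col][col] % p
        inv = pow(a[col][col], -1, p)
        for r in range(col + 1, size):
            f = a[r][col] * inv % p
            if f:
                a[r] = [(x - f * y) % p for x, y in zip(a[r], a[col])]
    return det % p


def lower_toeplitz(t):
    """Lower triangular Toeplitz matrix with first column t = (t_1, ..., t_n)."""
    n = len(t)
    return [[t[i - j] if i >= j else 0 for j in range(n)] for i in range(n)]


def is_superregular(T, p):
    """True if every proper minor of T (indices with j_s <= i_s) is nonzero."""
    n = len(T)
    for r in range(1, n + 1):
        for rows in combinations(range(n), r):
            for cols in combinations(range(n), r):
                if any(c > i for i, c in zip(rows, cols)):
                    continue   # not a proper submatrix
                sub = [[T[i][c] for c in cols] for i in rows]
                if det_mod_p(sub, p) == 0:
                    return False
    return True


print(is_superregular(lower_toeplitz([1, 1, 2]), 5))
print(is_superregular(lower_toeplitz([1, 1, 1]), 5))
```

The second matrix fails because the proper submatrix on rows 2, 3 and columns 1, 2 has two equal rows.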

Unfortunately, the characterization or construction of these 
matrices is a hard problem and more research is needed in this 
direction in order to come up with a construction for MDP 
convolutional codes. 

IV. Decoding with the help of the generator matrix

In this section we explain how the use of the generator 
matrix of the MDP code can make our decoding process more 
efficient and faster. 



We know that the encoding process is represented by
$G(z)u(z) = v(z)$, so the idea is to use this relation to recover
the original message $u(z)$ directly instead of first computing
the code sequence and then decoding it into the original sequence
$u(z)$, as we did before when working with the parity check
matrix.

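In an analogous way to the parity check matrix we can
expand the generator matrix into

$$G(z) = G_0 + G_1 z + \dots + G_m z^m$$

and define $\mathcal{G}_j$ (with $G_i = 0$ for $i > m$) as

$$\mathcal{G}_j = \begin{bmatrix}
G_0 & & & \\
G_1 & G_0 & & \\
\vdots & \vdots & \ddots & \\
G_j & G_{j-1} & \cdots & G_0
\end{bmatrix}. \tag{10}$$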

Then the equivalences in the following theorem give us the 
properties to improve the decoding algorithm. 

Theorem 4.1: ([3, Theorem 2.4]) Let $\mathcal{H}_j$ and $\mathcal{G}_j$ be as in
(6) and (10). Then the following are equivalent:

1) $d^c_j = (n - k)(j + 1) + 1$;

2) every $(j + 1)k \times (j + 1)k$ full-size minor of $\mathcal{G}_j$ formed
from the columns with indices $1 \le t_1 < \dots < t_{(j+1)k}$,
where $t_{sk+1} > sn$ for $s = 1, \dots, j$, is nonzero.

One notices that for an MDP convolutional code the maximum
size of the matrix $\mathcal{G}_j$ we can construct is again given by
the parameter $L$. This tells us that the maximum number of
original symbols we are able to recover at one time is $(L + 1)k$.
Since in any sliding window of length $(L + 1)n$ no more than
$(L + 1)(n - k)$ erasures occur, we can set up a full rank system with
at least $(L + 1)k$ equations to recover the $(L + 1)k$ symbols of
the original sequence. We leave the details to the reader.

V. Conclusion

In this paper, we propose MDP convolutional codes as an 
alternative to block codes when decoding over an erasure 
channel. We have seen that the step-by-step-MDS property 
of the MDP codes lets us recover the maximum number 
of erasures at every step. Even over large field sizes the 
complexity of decoding is polynomial for a fixed window size 
since the decoding algorithm requires the solving of some 
linear system only. Moreover, the sliding window property 
allows us to adapt the decoding process to the distribution 
of the erasures in the sequence. We have shown how the
possibility of taking smaller windows lets us recover erasures
that block codes cannot recover.